ABSTRACT

Upcoming large-scale ground- and space-based supernova surveys will face a challenge
identifying supernova candidates largely without the use of spectroscopy. Over the past
several years, a number of supernova identification schemes have been proposed that rely on
photometric information only. Some of these schemes use color-color or color-magnitude
diagrams; others simply fit supernova data to models. Both of these approaches suffer a
number of drawbacks partially addressed in the so-called Bayesian-based supernova
classification techniques. However, Bayesian techniques are also problematic in that they
typically require that the supernova candidate be one of a known set of supernova types.
This presents a number of problems, the most obvious of which is that there are bound to be
objects that do not conform to any presently known model in large supernova candidate
samples. We propose a new photometric classification scheme that uses a Bayes factor based
on color in order to identify supernovae by type. This method does not require knowledge of
the complete set of possible astronomical objects that could mimic a supernova signal.
Further, as a Bayesian approach, it accounts for all systematic and statistical
uncertainties of the measurements in a single step. To illustrate the use of the technique,
we apply it to a simulated dataset for a possible future large-scale space-based Joint Dark
Energy Mission and demonstrate how it could be used to identify Type Ia supernovae. The
method's utility in pre-selecting and ranking supernova candidates for possible
spectroscopic follow-up - i.e., its usage as a supernova trigger - will be briefly
discussed.

Subject headings: supernovae: general - techniques: photometric 
1. Introduction



In recent years, the question of photometric identification of supernova candidates has
emerged as one of the crucial issues to be resolved before the advent of large-scale
supernova cosmology experiments, both ground-based (e.g., the Large Synoptic Survey
Telescope [LSST], the Dark Energy Survey [DES], the Panoramic Survey Telescope and Rapid
Response System [Pan-STARRS]), and space-based (e.g., the Joint Dark Energy Mission
[JDEM]). There are a number of reasons for this. First, although there have been some
interesting developments in the possible uses of supernovae other than Type Ia for
cosmology (Baron et al. 2004; Hamuy and Pinto 2002; Nugent et al. 2006; Poznanski et al.
2009), Type Ia supernovae (SNe Ia) remain the staple of experimental cosmology. Second,
SNe Ia are most reliably identified using spectroscopy due to the presence of a
characteristic Si II line at 6150 Å in the supernova rest frame. However, future large
ground-based surveys are expected to collect thousands of supernova candidates, making a
spectroscopic follow-up of each candidate all but unrealistic. The identification of
supernova candidates (with possible spectroscopic follow-up for a select sample) based on
broadband photometry remains the only feasible alternative.

There have been a number of methods proposed to identify supernovae using broadband
photometry that can be divided into three broad categories. One includes methods that rely
on color-color or color-magnitude diagrams (Poznanski et al. 2002; Riess et al. 2004;
Johnson and Crotts 2005; Sullivan et al. 2006a). It is also possible to fit supernova data
to models, and select the best fit (using, for example, a $\chi^2$), which can be used to
represent the supernova type (Jha et al. 2007; Guy et al. 2007; Conley et al. 2008).
Finally, the third category involves recently developed techniques based on a probabilistic
(Bayesian) approach to the problem (Kuznetsova and Connolly 2007; Poznanski et al. 2007).
The method proposed in this work, although closer in spirit to the second category, has a
number of advantages over both.

The existing techniques, while adequate in many cases, have a number of serious
shortcomings. For example, supernova identification schemes based on color-color and
color-magnitude diagrams involve comparing the colors and/or magnitudes of a given
supernova candidate with what is predicted by various supernova models. This is an
intuitive approach, allowing one to visually judge the goodness of fit of the data to the
models; however, it is difficult to account for all statistical and systematic
uncertainties in a single step.

A class of techniques that could be generally described as "$\chi^2$-based" simply find the
best fit for a given supernova candidate's light curves to a supernova model. This is also
an intuitive and often completely reasonable approach, which nevertheless suffers the
following disadvantages:

1. "This object is not a supernova of any kind" is not a well-defined hypothesis in this as
in any other frequentist approach (Edwards 1992). Conversely, if the data happen to have
large uncertainties, there is the possibility that a number of supernova models will be
good fits to the data. There is no formalism to compute not only the probability that a
given fit is good, but also that it is bad. In other words, what one is interested in is
the posterior probability, the probability that a given hypothesis is true given the data.
Calculating this probability requires that the probability that this hypothesis is false be
also known.

2. Using a $\chi^2$-based technique only gives the information about the best fit for a
given set of data to a model, while any information about worse fits is lost. The best fit
will not necessarily reflect the true properties of the supernova.

3. In cases where one would like to use a tail probability for accepting or rejecting given
supernova candidates (e.g., as SNe Ia), the probability of falsely rejecting the null
hypothesis (the so-called Type I error rate) can be shown to be severely underestimated
(see Sellke, Bayarri, & Berger (2001) and references therein).

Bayesian classification schemes address many of the problems of the above-mentioned
methods. However, existing Bayesian-based supernova typing methods have a serious drawback:
they require the knowledge of the complete set of objects that a supernova candidate might
conceivably be (Kuznetsova and Connolly 2007; Poznanski et al. 2007). That is, they assume
that a supernova candidate can only be one of a finite set of supernova types. However,
even with the current small high-redshift SN sample (obtained almost exclusively with the
Hubble Space Telescope) one occasionally finds supernova candidates with surprising new
properties that do not seem to conform to any known models (Barbary et al. 2009). Problems
with assuming a finite set of supernova-like objects are further addressed in Section 3.3.

In our work, we introduce a likelihood ratio (a Bayes factor) that is capable of
discriminating between SNe Ia and anything else based on broadband photometric
measurements. The most important feature of this technique is that it is independent of the
knowledge of the complete set of objects that a supernova candidate might conceivably be.
Another advantage is that, as with all Bayesian-based techniques, this method allows one to
include all possible statistical and systematic uncertainties in a single step. Finally,
the Bayes factor is formulated in terms of color and thus does not require that one make
any assumptions about the absolute magnitudes of the supernova candidates in the broadband
filters used. Of course it is often desirable to include magnitudes in the formalism;
however, not only does it require making assumptions about the distribution of magnitudes
for various known supernova types, but also it places a hard upper limit on the intrinsic
magnitudes of objects that have yet to be observed. But more importantly, "anomalous"
(non-supernova) objects can be defined in a far more mathematically elegant and
computationally manageable way using color alone.

The Bayes factor is defined as

$$\mathcal{R} = \frac{P(\mathrm{Phot}\,|\,\mathrm{Ia})}{P(\mathrm{Phot}\,|\,\text{non-Ia})}, \tag{1}$$

where $P(\mathrm{Phot}|\mathrm{Ia})$ is the probability of obtaining the observed photometry
(colors) from a SN Ia, and $P(\mathrm{Phot}|\text{non-Ia})$ is the probability of obtaining
the data for any other object (which could be a non-Ia supernova or any other object capable
of mimicking an SN Ia signal). Both probabilities take into account the relative
distribution of light among the broadband filters used for the measurements. In general, no
specific set of models (or templates) for non-SNe Ia is required for the calculation of the
denominator.

On a more technical note, it is worthwhile to point out that Bayes factors are normally
used for deciding on the best of two hypotheses. This allows one to easily set thresholds
on the Bayes factor in terms of the so-called Type I and Type II error rates^1 in the same
way as thresholds are set on the likelihood ratio in Wald (1945, 1947). Also, although the
main focus of this work is to describe a method that can identify SNe Ia, the Bayes factor
can be easily cast in terms of a posterior probability that a candidate is a Type T
supernova, where T could be Ibc, II-P, IIn, etc.^2

This paper is organized as follows. In Section 2 we derive an expression for $\mathcal{R}$
for a number of different cases. We describe the performance of the method in Section 3.
Section 4 presents a discussion of the results.

^1 Type I error is the probability of rejecting the null hypothesis when the null hypothesis
is in fact correct; it is thus a measure of the purity of the selection. Type II error is
the probability that the null hypothesis will be accepted when the null hypothesis is in
fact false; it is thus a measure of the efficiency of the selection.

^2 Consider some data $D$, a hypothesis $H_0$, and its alternative $H_1$. The Bayes factor
can be defined as

$$\mathcal{R} = \frac{P(D|H_1)}{P(D|H_0)}. \tag{2}$$

The posterior probability that the alternative hypothesis is true for the data can then be
written in terms of $\mathcal{R}$ provided that one knows the priors for $H_0$ and $H_1$,
denoted by $P(H_0)$ and $P(H_1)$, respectively:

$$P(H_1|D) = \frac{\mathcal{R}\,P(H_1)}{\mathcal{R}\,P(H_1) + P(H_0)}. \tag{3}$$

See Berger and Pericchi (2001) for details. Although there are historical reasons why the
Bayes factor is formulated in this way, it is also convenient when setting thresholds for
the error rates because often the error rates are at least somewhat determined by the
information contained in the priors on $H_0$ and $H_1$.
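The conversion from a Bayes factor to a posterior probability described in the second footnote can be illustrated with a minimal sketch. It assumes the convention $\mathcal{R} = P(D|H_1)/P(D|H_0)$ and two exhaustive hypotheses, so that $P(H_0) = 1 - P(H_1)$; the function name and the example priors are hypothetical and are not taken from the paper.

```python
def posterior_from_bayes_factor(bayes_factor, prior_h1):
    """Posterior P(H1 | D) for two exhaustive hypotheses.

    Assumes bayes_factor = P(D | H1) / P(D | H0) and P(H0) = 1 - prior_h1.
    """
    if bayes_factor < 0.0:
        raise ValueError("a Bayes factor cannot be negative")
    if not 0.0 < prior_h1 < 1.0:
        raise ValueError("prior_h1 must lie strictly between 0 and 1")
    prior_h0 = 1.0 - prior_h1
    weighted = bayes_factor * prior_h1
    return weighted / (weighted + prior_h0)
```

With equal priors, a Bayes factor of 9 corresponds to a posterior of 0.9, while a Bayes factor of 1 simply returns the prior.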



2. Derivation of the Bayes Factor

The Bayes factor, $\mathcal{R}$, introduced above, is defined as the probability of
obtaining the photometric measurements assuming that the supernova candidate is a Type Ia
over the probability that it is anything else. In practice, the probability that a
candidate is an SN Ia is the probability that the colors are consistent with what is
expected for an SN Ia using some prior knowledge about the behavior of Type Ia's. In our
first formulation of the Bayes factor, if the candidate is not in fact an SN Ia, then the
distribution of light in the broadband filters used can be arbitrary. One could argue that
much of the background for SNe Ia will be supernovae of other types whose behavior is
relatively well-known. However, the unprecedented scale of the future supernova surveys
makes it highly likely that many new types of transient objects will be discovered. Also,
little is known about the rates of non-Type Ia supernovae, especially at very high
redshifts, making it difficult to predict the behavior of the background at those
redshifts.

2.1. Overview of the Calculation

We begin with a general overview of the calculation of $\mathcal{R}$. For simplicity, we
assume that there are only two broadband filters, and that there is a single measurement of
the supernova candidate's flux in each. Suppose that the flux is measured in photon counts,
and that $M_1$ counts are measured in the first filter, and $M_2$ in the second. Further
suppose that there exists a model (a template) for the behavior of SNe Ia in these filters,
and that the model predicts that some mean fraction of photons, $\bar{f}$, must end up in
the first filter, and $1-\bar{f}$ in the second. The numerator of $\mathcal{R}$,
$P(\mathrm{Phot}|\mathrm{Ia})$, is essentially the probability that the measurement is
consistent with this model. Assuming Poisson statistics for the photon distributions, it
can be easily shown that $P(\mathrm{Phot}|\mathrm{Ia})$ takes the form of a standard
binomial distribution:

$$P(\mathrm{Phot}|\mathrm{Ia}) = \frac{(M_1+M_2)!}{M_1!\,M_2!}\,\bar{f}^{M_1}(1-\bar{f})^{M_2}.$$

For the calculation of the denominator of $\mathcal{R}$, $P(\mathrm{Phot}|\text{non-Ia})$,
we do not make any a priori assumptions about the fraction of light that will end up in
either filter. The Bayesian framework of the calculation allows one to circumvent this
difficulty by marginalizing, or integrating over, all possible fractions. Mathematically,

$$P(\mathrm{Phot}|\text{non-Ia}) = \int_0^1 df\,\frac{(M_1+M_2)!}{M_1!\,M_2!}\,f^{M_1}(1-f)^{M_2}.$$

In reality, the calculation becomes rather more complicated. To begin with, the measured
flux will most likely be better described using Gaussian, rather than Poisson, statistics.
We must also allow for the possibility of multiple measurements and more than 2 filters. In
the next section we will make the calculation more explicit and account for all of these
factors.



2.2. Mathematical Details 

Before we plunge into the full derivation of $\mathcal{R}$ for the case of Gaussian
statistics and multiple measurements and filters, we take a closer look at the simple case
of a single measurement of a supernova candidate in just two filters, assuming that the
photon count fluctuations in the filters are Poisson. Recall that we assume that $M_1$
counts are measured by the first filter and $M_2$ by the second; and that we have a model
that predicts a certain fraction of photons, $\bar{f}$, for the first filter, and
$(1-\bar{f})$ for the second.



Following a similar derivation in Jeffreys (1961), let us now introduce two variables, $f$
and $b$, such that the mean number of photons in the first filter is given by $fb$, and the
mean number of photons in the second filter is given by $(1-f)b$. Variable $b$ ranges from
0 to $\infty$, and can be thought of as the mean number of photons that are counted in both
filters for a given measurement. Variable $f$ ranges from 0 to 1, and can be thought of as
the probability that the photons will end up in the first filter as opposed to the second.
An analogy would be collecting balls into two receptor bins with different volumes: in this
case, $b$ would be the mean number of balls that will enter both bins, and $f$ is the
relative "acceptance" of one bin. The introduction of these variables allows us to expand
the Bayes factor, Eqn. 1, in terms of $f$ and $b$:

$$\mathcal{R} = \frac{\int_0^\infty db\,P(\mathrm{Phot}|b,\mathrm{Ia})\,P(b|\mathrm{Ia})}{\int_0^1 df \int_0^\infty db\,P(\mathrm{Phot}|f,b,\text{non-Ia})\,P(f,b|\text{non-Ia})}. \tag{4}$$

Here, the first term in the numerator, $P(\mathrm{Phot}|b,\mathrm{Ia})$, is the likelihood
of obtaining the measurement given that the mean number of photons is $\bar{f}b$ in the
first filter, and $(1-\bar{f})b$ in the second. Likewise, the first term in the
denominator, $P(\mathrm{Phot}|f,b,\text{non-Ia})$, is the likelihood of obtaining the
measurement given that the mean number of photons is $fb$ in the first filter, and
$(1-f)b$ in the second. Note that the numerator is not a function of $f$ because $f$ is
single valued in the numerator, $f = \bar{f}$. If the photon distribution is governed by
Poisson statistics, then:

$$P(\mathrm{Phot}|b,\mathrm{Ia}) = \frac{(\bar{f}b)^{M_1} e^{-\bar{f}b}}{M_1!}\,\frac{[(1-\bar{f})b]^{M_2} e^{-(1-\bar{f})b}}{M_2!} \tag{5}$$

and

$$P(\mathrm{Phot}|f,b,\text{non-Ia}) = \frac{(fb)^{M_1} e^{-fb}}{M_1!}\,\frac{[(1-f)b]^{M_2} e^{-(1-f)b}}{M_2!}. \tag{6}$$

The terms $P(b|\mathrm{Ia})$ in the numerator and $P(f,b|\text{non-Ia})$ in the denominator
of Eqn. 4 are prior probabilities containing information regarding the expected
distribution of light in the two filters for an SN Ia and anything else, respectively.
Defining $b_{min}$ and $b_{max}$ as the minimum and maximum bounds for $b$ and assuming
each value for $b$ in between these bounds is equally probable, we have:

$$P(b|\mathrm{Ia}) = \frac{1}{b_{max} - b_{min}}. \tag{7}$$

Note that the range of $b$ will always be assumed to be $b \in [0, \infty)$, although the
upper and lower bounds will initially be set to $b_{max}$ and $b_{min}$, respectively.^3
However, if the candidate is not an SN Ia, we do not make any assumptions about what to
expect, and so:

$$P(f,b|\text{non-Ia}) = \frac{1}{(b_{max} - b_{min})(f_{max} - f_{min})} = \frac{1}{b_{max} - b_{min}}, \tag{8}$$

as the upper ($f_{max}$) and lower ($f_{min}$) bounds of $f$ are 1 and 0, respectively.

Note that $P(b|\mathrm{Ia})$ and $P(f,b|\text{non-Ia})$ are improper priors (in other words,
they assume probability density functions that are flat and are integrated from zero to
infinity). This is not a major issue for our calculation because the priors happen to
cancel. However, in general the use of improper priors must be treated with caution (Berger
and Pericchi 2001). It is therefore important to check that $\mathcal{R}$ does indeed
behave properly; this will be addressed further in Section 3.2.

^3 Here we have adopted Jaynes' methodology (Jaynes and Bretthorst 2003), expressing the
prior probabilities in terms of variables representing the bounds of those variables. These
bounds are inserted at the end of the calculations (integrations) with the goal of avoiding
handling variables whose limits are defined as $[0,\infty)$ - i.e., to avoid priors whose
probabilities approach 0.
With these priors, Eqn. 4 becomes:

$$\mathcal{R} = \frac{\dfrac{(M_1+M_2)!}{M_1!\,M_2!}\,\bar{f}^{M_1}(1-\bar{f})^{M_2}}{\dfrac{1}{M_1+M_2+1}}. \tag{9}$$

In the calculation of Eqn. 9, $b$ is effectively unconstrained, leaving the supernova
candidate's magnitude free to take on any value. That is, as we are only concerned with the
relative fraction of photons in each filter (i.e., color), we need not make any assumptions
about the behavior of $b$.
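As an illustration of the two-filter Poisson result in Eqn. 9, the following minimal sketch evaluates $\log\mathcal{R}$ in closed form and checks numerically that the marginalized binomial in the denominator equals $1/(M_1+M_2+1)$. The function names and the example counts are hypothetical; only the formula itself comes from the derivation above.

```python
import math


def log_bayes_factor_poisson(m1, m2, f_bar):
    """log R for counts (m1, m2) in two filters and a template fraction f_bar.

    Closed form: R = (m1 + m2 + 1)! / (m1! m2!) * f_bar**m1 * (1 - f_bar)**m2.
    """
    if not 0.0 < f_bar < 1.0:
        raise ValueError("f_bar must lie strictly between 0 and 1")
    # lgamma(n + 1) = log(n!), which avoids overflow for large counts.
    log_coeff = math.lgamma(m1 + m2 + 2) - math.lgamma(m1 + 1) - math.lgamma(m2 + 1)
    return log_coeff + m1 * math.log(f_bar) + m2 * math.log(1.0 - f_bar)


def marginal_binomial(m1, m2, steps=20000):
    """Midpoint-rule integral over f in [0, 1] of the binomial likelihood."""
    coeff = math.comb(m1 + m2, m1)
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        f = (i + 0.5) * width
        total += coeff * f ** m1 * (1.0 - f) ** m2
    return total * width
```

For example, counts of (3, 7) favor a template with $\bar{f} = 0.3$ (positive $\log\mathcal{R}$) and disfavor one with $\bar{f} = 0.9$ (negative $\log\mathcal{R}$), independent of the total number of counts expected.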

We would now like to derive an equation analogous to Eqn. 9, but for the case of Gaussian
statistics. Let us suppose that instead of measuring $M_1$ photons in the first filter and
$M_2$ in the second, we now measure a flux $F_1$ in the first filter with an error
$\sigma_1$, and a flux $F_2$ in the second filter with an error $\sigma_2$. As before, we
parametrize the mean (or "true") fluxes in the two filters as $fb$ and $(1-f)b$, and expand
$\mathcal{R}$ in terms of $f$ and $b$, leading to Eqn. 4. We then simply replace the
Poisson distributions with Gaussian ones using the usual notation for a Gaussian
distribution,
$G(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(\mu - x)^2/(2\sigma^2)}$.
Equation 4 becomes

$$\mathcal{R} = \frac{\int_0^\infty db\,G(F_1; \bar{f}b, \sigma_1)\,G(F_2; (1-\bar{f})b, \sigma_2)}{\int_0^1 df \int_0^\infty db\,G(F_1; fb, \sigma_1)\,G(F_2; (1-f)b, \sigma_2)}. \tag{10}$$

The integration over $b$ in Eqn. 10 can be reduced further leading to the appearance of the
Gauss error function. However, the integration over $f$ in the denominator can only be done
numerically.
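The two-filter Gaussian Bayes factor of Eqn. 10 can be sketched as follows. The integral over $b$ is done analytically by completing the square, which produces the error function mentioned above, and the integral over $f$ is done with Simpson's rule. The function names, the step count, and the example fluxes are assumptions made for illustration; the flat priors are taken to cancel as in the Poisson case.

```python
import math


def b_marginal(flux1, flux2, sigma1, sigma2, f):
    """Integral over b in [0, inf) of G(F1; f*b, s1) * G(F2; (1-f)*b, s2)."""
    a1, a2 = f, 1.0 - f
    # Exponent is -0.5 * (A*b**2 - 2*B*b + C); complete the square in b.
    A = (a1 / sigma1) ** 2 + (a2 / sigma2) ** 2
    B = a1 * flux1 / sigma1 ** 2 + a2 * flux2 / sigma2 ** 2
    C = (flux1 / sigma1) ** 2 + (flux2 / sigma2) ** 2
    norm = 1.0 / (2.0 * math.pi * sigma1 * sigma2)
    half_line = math.sqrt(math.pi / (2.0 * A)) * (1.0 + math.erf(B / math.sqrt(2.0 * A)))
    return norm * math.exp(-0.5 * (C - B * B / A)) * half_line


def bayes_factor_gaussian(flux1, flux2, sigma1, sigma2, f_bar, steps=2000):
    """R for one measurement in two filters; steps must be even (Simpson's rule)."""
    if steps % 2:
        raise ValueError("steps must be even")
    h = 1.0 / steps
    total = b_marginal(flux1, flux2, sigma1, sigma2, 0.0)
    total += b_marginal(flux1, flux2, sigma1, sigma2, 1.0)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * b_marginal(flux1, flux2, sigma1, sigma2, i * h)
    denominator = total * h / 3.0
    return b_marginal(flux1, flux2, sigma1, sigma2, f_bar) / denominator
```

In this sketch a candidate with fluxes (30, 70) and errors of 5 in each filter yields $\mathcal{R} > 1$ for a template fraction of 0.3 and a vanishingly small $\mathcal{R}$ for a template fraction of 0.8.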

We now make Eqn. 10 even more realistic by considering multiple measurements (say, $N$) and
an arbitrary number of filters (say, $M$). Using the formalism we have developed above, we
will assume that for the $j$th measurement the fraction of light in the $k$th filter is
$f_j^k$, and the total light distributed between all the filters and all the measurements
is given by $b$. Therefore, the hypothesized flux in the $k$th filter and $j$th measurement
is given by $f_j^k b$. Again, if the supernova is assumed to be a Type Ia, then we have a
model that describes what the fraction of light in each of the filters for each of the
measurements must be. The model must take into account the many possible observational
parameters that characterize an SN Ia. For example, it is known that SNe Ia have a variety
of possible "stretch" values, which parametrize the width of their light curves (Perlmutter
et al. 1997). Following the approach used in Kuznetsova and Connolly (2007), we represent
the Type Ia supernova parameters by $\theta$, defined as

$$\theta = (s, A_V, R_V, t_{diff}, z), \tag{11}$$

where $s$ is the stretch parameter; $A_V$ and $R_V$ parametrize the effect of interstellar
dust extinction using the Cardelli-Clayton-Mathis (CCM) parametrization (Cardelli et al.
1998); $t_{diff}$ accounts for the difference in the time of maximum of the model and the
data; and $z$ is the redshift. In other words, for Type Ia's $f_j^k$ will become
$\bar{f}_j^k(\theta)$. The exact assumptions about the distribution of these parameters
will be discussed below when the prior probabilities for all of the $\theta$ parameters
will be stated explicitly.

Since, in general, the exact values of each of these parameters for a given candidate are
unknown, they must be marginalized. If the $i$th measurement in filter $k$ of the flux is
$F_i^k$ with error $\sigma_i^k$, the multi-measurement, multi-filter analog of Eqn. 10 is:

$$\mathcal{R} = \frac{\sum_\theta \int_0^\infty db\,P(\{F_i^k\},\{\sigma_i^k\}\,|\,\theta, b, \mathrm{Ia})\,P(\theta, b|\mathrm{Ia})}{\int d\mathbf{f}' \int_0^\infty db\,P(\{F_i^k\},\{\sigma_i^k\}\,|\,\mathbf{f}', b, \text{non-Ia})\,P(\mathbf{f}', b|\text{non-Ia})}, \tag{12}$$

where $\mathbf{f}' = \{f_j^k\}$, and $d\mathbf{f}'$ indicates an integration over the
multi-dimensional parameter space where $\sum_{k=1}^{M}\sum_{j=1}^{N} f_j^k = 1$. The
denominator is not parametrized by $\theta$ as it is not known what parameters are relevant
for what we define as "anything other than SNe Ia". Therefore, every possible distribution
of light in the filters is given an equal chance.

We now address each term in Eqn. 12 in turn.
$P(\{F_i^k\},\{\sigma_i^k\}\,|\,\theta, b, \mathrm{Ia})$ and
$P(\{F_i^k\},\{\sigma_i^k\}\,|\,\mathbf{f}', b, \text{non-Ia})$ are the likelihoods of
obtaining a set of fluxes, $\{F_i^k\}$, with uncertainties $\{\sigma_i^k\}$, for a number of
measurements and filters, given that the mean numbers of photons are
$\{\bar{f}_j^k(\theta)\,b\}$ and $\{f_j^k b\}$, respectively. Assuming each measurement and
every filter are independent,

$$P(\{F_i^k\},\{\sigma_i^k\}\,|\,\theta, b, \mathrm{Ia}) = \prod_{k=1}^{M}\prod_{i=1}^{N} G(F_i^k; \bar{f}_j^k(\theta)\,b, \sigma_i^k) \tag{13}$$

and

$$P(\{F_i^k\},\{\sigma_i^k\}\,|\,\mathbf{f}', b, \text{non-Ia}) = \prod_{k=1}^{M}\prod_{i=1}^{N} G(F_i^k; f_j^k b, \sigma_i^k). \tag{14}$$

Note that, in general, the measured flux ($F_i^k$) and the hypothesized flux ($f_j^k b$)
have different subscripts (which indicate the measurement number). This is done to
emphasize the fact that it is unknown where the time of maximum of our measured light curve
is relative to that of the model. This uncertainty is taken into account in one of the
$\theta$ parameters, $t_{diff}$.

The terms $P(\theta, b|\mathrm{Ia})$ and $P(\mathbf{f}', b|\text{non-Ia})$ in Eqn. 12 are
prior probabilities. In particular, $P(\theta, b|\mathrm{Ia})$ contains the prior knowledge
about the parameters $\theta$ that describe an SN Ia. For
$P(\mathbf{f}', b|\text{non-Ia})$, $\mathbf{f}'$ is not constrained. Likewise, parameter
$b$ is not constrained in any way for either a Type Ia or a non-Ia prior. It is therefore
marginalized. Integrating over $b$ means integrating over the total light in all the
filters and all the measurements for a given candidate - i.e., integrating over the
observed magnitude for this measurement. Furthermore, in marginalizing $b$, there is an
implicit assumption about the prior distribution of $b$ - namely, that it is flat. This
assumption allows us to formulate the probabilities purely in terms of color. Allowing the
total light to vary measurement-by-measurement with a flat prior is arguably the lightest
possible assumption one can make regarding the magnitude.

^4 Note that as long as the overlap between the filters is not 100%, then, without assuming
anything about the underlying spectrum, any relative fraction of light is allowed between
the two filters.

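Explicitly, the priors $P(\theta, b|\mathrm{Ia})$ and $P(\mathbf{f}', b|\text{non-Ia})$
become:

$$P(\mathbf{f}', b|\text{non-Ia}) = \frac{1}{b_{max} - b_{min}}\,\delta\!\left(1 - \sum_{j=1}^{N}\sum_{k=1}^{M} f_j^k\right), \tag{15}$$

since the only constraint here is that all the light fractions in different filters add up
to one; and

$$P(\theta, b|\mathrm{Ia}) = \zeta(\theta)\,\frac{1}{b_{max} - b_{min}}, \tag{16}$$

where $\zeta(\theta)$ is the prior probability of $\theta$.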



The priors on $\theta$ are defined similarly to those in Kuznetsova and Connolly (2007) and
are briefly summarized below. The stretch parameter $s$ follows a Gaussian distribution
with a mean of $\bar{s} = 0.97$ and a width of $\delta s = 0.09$ (these values are
extracted from Sullivan et al. (2006b)). The CCM parameters $A_V$ and $R_V$ can assume two
sets of values with equal probabilities: $(A_V, R_V) = (0.0, 0.0)$ (no extinction) and
$(0.2, 2.1)$ (moderate extinction). The prior probability for each choice of $A_V$ and
$R_V$ is therefore $N_{dust} = 1/2$. The parameter accounting for the difference between
the time of maximum of the data and the model, $t_{diff}$, has a flat prior: the measured
light curve is shifted relative to a template in one-day increments 1000 times, and each
shift is assigned an equal probability $N_{t_{diff}} = 1/1000$. Finally, the redshift
parameter $z$ is assumed to be known from the supernova candidate's host galaxy, $z_{gal}$,
with an associated uncertainty of $\sigma_{gal}$. We consider the range of redshifts from 0
to 1.7, and assume two representative possibilities for $\sigma_{gal}$: 0.005 (which might
be obtained through a spectroscopic analysis of the host galaxy's spectrum) and 0.1
(obtained through a photometric analysis). Therefore,

$$\zeta(\theta) = G(s; \bar{s}, \delta s)\,N_{dust}\,N_{t_{diff}}\,G(z_{gal}; z, \sigma_{gal}). \tag{17}$$
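The prior $\zeta(\theta)$ of Eqn. 17 is simple enough to write out directly. The sketch below uses the numerical values quoted in the text (stretch mean 0.97 and width 0.09, two equally weighted dust choices, 1000 equally weighted time shifts); the function and constant names are hypothetical, and the dust and time-shift factors are treated as constant weights because every allowed choice has the same probability.

```python
import math

S_MEAN, S_WIDTH = 0.97, 0.09              # stretch prior quoted in the text
DUST_OPTIONS = ((0.0, 0.0), (0.2, 2.1))   # allowed (A_V, R_V) pairs, equal weight
N_DUST = 1.0 / len(DUST_OPTIONS)          # = 1/2
N_TDIFF = 1.0 / 1000.0                    # 1000 one-day shifts, equal weight


def gaussian(x, mean, sigma):
    """Normalized Gaussian density G(x; mean, sigma)."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)


def zeta(stretch, z, z_gal, sigma_gal):
    """Prior weight of one grid point theta = (s, A_V, R_V, t_diff, z)."""
    return gaussian(stretch, S_MEAN, S_WIDTH) * N_DUST * N_TDIFF * gaussian(z_gal, z, sigma_gal)
```

A trial redshift one $\sigma_{gal}$ away from the host-galaxy redshift is down-weighted by a factor $e^{-1/2}$ relative to a trial redshift equal to $z_{gal}$.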



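Putting everything together, we obtain the full Bayes factor:

$$\mathcal{R} = \frac{\sum_\theta \zeta(\theta) \int_0^\infty db \prod_{k=1}^{M}\prod_{i=1}^{N} G(F_i^k; \bar{f}_j^k(\theta)\,b, \sigma_i^k)}{\int d\mathbf{f}' \int_0^\infty db \prod_{k=1}^{M}\prod_{i=1}^{N} G(F_i^k; f_j^k b, \sigma_i^k)}, \tag{18}$$

where $\sum_\theta$ represents the sums and integrations over the parameters in $\theta$
(depending on whether they are discrete or continuous).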

Now the calculation of Eqn. 18 requires performing $N \times M$ integrations over $f_j^k$
in the denominator. For a large number of filters (say, > 8) and many measurements, it is
nearly impossible to do this calculation in a reasonable amount of time with the required
precision for $f_j^k$ without the use of techniques such as Markov Chain Monte Carlo
integration methods. However, the number of integrations can be reduced to $M$ (the number
of filters) if we allow for measurement-to-measurement variations in $b$. That is, if we
allow

$$b \rightarrow b_i, \tag{19}$$

$b_i$ can be brought inside the product over measurements. This assumption is the
equivalent of removing the knowledge that the colors between the measurements are known (or
in the Poisson case, this is equivalent to having a separate multinomial for each
measurement). This effectively releases the constraint on colors between measurements.
Technically, the effect of this assumption should be to sweep more candidates which look
less like SNe Ia into the SN Ia hypothesis, so if one's sample contains "anomalous"
candidates that very closely mimic SNe Ia one might reconsider this conjecture.

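With Eqn. 19, Eqn. 18 becomes

$$\mathcal{R} = \frac{\sum_\theta \zeta(\theta) \prod_{i=1}^{N} \int_0^\infty db_i \prod_{k=1}^{M} G(F_i^k; \bar{f}_j^k(\theta)\,b_i, \sigma_i^k)}{\prod_{i=1}^{N} \int d\mathbf{f}'_i \int_0^\infty db_i \prod_{k=1}^{M} G(F_i^k; f_i^k b_i, \sigma_i^k)}. \tag{20}$$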
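Because the factorized Bayes factor of Eqn. 20 multiplies roughly 50 per-measurement terms, its value can overflow or underflow in double precision, so in practice it is natural to accumulate it in log space. The sketch below assumes that the per-measurement integrals have already been evaluated on a discrete grid of $\theta$ values; the function names and the input layout are hypothetical and are not taken from the paper.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    if peak == -math.inf:
        return -math.inf
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def log_bayes_factor(log_prior, log_num_terms, log_den_terms):
    """log R for a discrete theta grid and per-measurement factorization.

    log_prior[t]        : log zeta(theta_t) for grid point t
    log_num_terms[t][i] : log of the b_i-integrated SN Ia likelihood of measurement i
    log_den_terms[i]    : log of the (f, b_i)-integrated non-Ia likelihood of measurement i
    """
    per_theta = [lp + sum(terms) for lp, terms in zip(log_prior, log_num_terms)]
    return log_sum_exp(per_theta) - sum(log_den_terms)
```

Working with logarithms keeps the sum over $\theta$ finite even when individual products of Gaussians are far smaller than the smallest representable double.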



3. Performance Studies 

3.1. The Simulated Dataset 

In order to check the performance of the method proposed above, we simulate a dataset
closely mimicking one that could be obtained by a possible JDEM space-based mission. The
mission is based on a 2-m class telescope, and is capable of taking multi-band photometric
data in the wavelength range from 0.3 to 1.7 μm. Photometric data are assumed to be taken
every 4 days in the observer frame, with an exposure time of 1200 seconds. Note that this
study is not meant to test the performance of any particular JDEM mission; we simply test
the performance of the method assuming a fairly generic, plausible JDEM.

We create a number of supernova candidates of a given type using spectral templates from
Hsiao et al. (2007) for Type Ia's, and from P. E. Nugent for non-Ia's.^5 The date of
explosion for a given candidate is chosen randomly within the confines of a 3 year mission
timeline, and supernova candidate properties are generated according to the probability
distribution functions in $\zeta(\theta)$ (see Eqn. 17). The supernova light curves are
then realized using a simple aperture exposure time calculator. We generate supernova
candidates of types Ia, Ibc, II-P, and IIn. We assume that the intrinsic rest-frame B-band
magnitudes follow a Gaussian distribution, with the mean and standard deviations obtained
from Richardson et al. (2002). In particular, the mean and standard deviation are taken to
be -19.05 ± 0.30 mags for Type Ia's; -17.27 ± 1.30 mags for Type Ibc's; -19.05 ± 0.92 mags
for Type IIn's; and -16.64 ± 1.12 mags for Type II-P's. We also generate "anomalous"
objects by creating fake unimodal light curves (light curves that rise and fall with a
single maximum, but are otherwise random). These light curves are assigned 1% errors in
each broadband filter considered.
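The random draws described above can be sketched as follows. The magnitude means and standard deviations are the values quoted in the text; the function names, the use of a uniform distribution for the explosion date, and the choice of 365 days per year are assumptions made for illustration, and the sketch does not reproduce the template-based light-curve generation itself.

```python
import random

# (mean, standard deviation) of the rest-frame B-band absolute magnitude, as quoted in the text.
ABS_MAG = {
    "Ia": (-19.05, 0.30),
    "Ibc": (-17.27, 1.30),
    "IIn": (-19.05, 0.92),
    "II-P": (-16.64, 1.12),
}


def draw_absolute_magnitude(sn_type, rng):
    """Draw one Gaussian-distributed absolute magnitude for the given type."""
    mean, sigma = ABS_MAG[sn_type]
    return rng.gauss(mean, sigma)


def draw_explosion_day(rng, mission_days=3 * 365):
    """Draw an explosion date (in days) uniformly within the mission timeline."""
    return rng.uniform(0.0, mission_days)
```

Passing an explicit `random.Random(seed)` instance as `rng` makes the simulated sample reproducible.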

To get a better feel for the simulated dataset, Fig. 1 shows the signal-to-noise ratios at
maximum light as a function of redshift for the simulated SNe Ia in the filter that most
closely matches the rest-frame B-band.

Fig. 1. - Distribution of the signal-to-noise ratios at maximum light as a function of
redshift for the simulated SNe Ia, in the filter that most closely matches the rest-frame
B-band.

Once the light curves are simulated, we select a subset of them in a limited number of
broadband filters. The chosen filters must include those that most closely match the
rest-frame B- and V-bands for a given candidate. We place a particular emphasis on these
bands because they correspond to a wavelength range where SNe Ia are particularly well
modeled. To limit the computational time, we only consider 50 consecutive photometric
measurements (the actual available number of measurements varies evenly from 0 to over
350). We also require that there be at least one measurement with a signal-to-noise ratio
> 5 in the filter most closely corresponding to the rest-frame B-band for a given
candidate. Any space-based dark energy mission that extends to at least a year and uses
SNe Ia as a dark energy probe will satisfy these requirements (in fact, every JDEM mission
currently on the market does).

^5 http://supernova.lbl.gov/~nugent/nugent_templates.html
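The selection cuts above can be sketched in a few lines. The text does not state how the 50 consecutive measurements are positioned within a light curve, so this sketch assumes, purely for illustration, that the window is centered on the epoch of highest signal-to-noise ratio; the function name and arguments are hypothetical.

```python
def select_window(snr_b_band, window=50, snr_threshold=5.0):
    """Return (start, stop) indices of the epochs to analyze, or None if rejected.

    snr_b_band: per-epoch signal-to-noise ratios in the filter closest to the
    rest-frame B-band. The window placement around the peak is an assumption.
    """
    n = len(snr_b_band)
    if n < window:
        return None  # not enough measurements for a full window
    peak = max(range(n), key=lambda i: snr_b_band[i])
    if snr_b_band[peak] <= snr_threshold:
        return None  # no epoch with S/N > threshold
    # Center the window on the peak, clamped to the available epochs.
    start = min(max(peak - window // 2, 0), n - window)
    return start, start + window
```

A candidate with 100 epochs and a single epoch at S/N = 8 is kept with a 50-epoch window around that epoch, while a candidate that never exceeds S/N = 5 is dropped.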



3.2. Results 

A number of tests are used to check the performance of the method. First, we calculate the
Bayes factor, $\mathcal{R}$ (Eqn. 20), for a sample of simulated SNe Ia and a sample
containing unimodal "fake" light curves. The unimodal light curves for a given "object"
peak at the same time in all the filter bands. To give a sense of their color distribution,
Fig. 2 shows the fake objects' colors for the 3 lowest-wavelength filter bands (the first
filter covers the range of 0.32-0.47 μm; the second, 0.41-0.56 μm; and the third,
0.49-0.68 μm). This test allows us to test the discrimination between SNe Ia and objects
that are not supernovae of any kind. Figure 3 shows that $\log\mathcal{R}$ is predominantly
positive for SNe Ia and negative for the random, unimodal data, meaning that
$\mathcal{R} < 1$ for the latter. This means that the method behaves as expected,
discriminating between random data and true supernovae 100% of the time.

Fig. 2. - The distribution of colors for the fake unimodal data. $M_i$ is the magnitude in
filter $i$ (filter 0 covers the range of 0.32-0.47 μm; filter 1, 0.41-0.56 μm; and filter
2, 0.49-0.68 μm).





Fig. 3. - Distributions of $\log\mathcal{R}$ for random unimodal data (solid line) and
SNe Ia (filled histogram). The two histograms have been normalized to the same area for an
easier shape comparison.

We also compute the Bayes factor for a sample of simulated Type Ibc's, Type II-P's, and
Type IIn's. Note that we do not have to include any information about the expected light
curves for Type Ibc's, Type IIn's, or Type II-P's to compute $\mathcal{R}$. Figure 4 shows
the comparison of $\log\mathcal{R}$ for the case of measurements in 2 filters (left column)
and 4 filters (right column). We consider the case of a 0.1 error on the redshift (top row)
and a 0.005 error on the redshift (bottom row). For this comparison, we assume that the
errors on the supernova fluxes are realistic (that is, what a JDEM mission described above
would be expected to obtain). Several interesting conclusions can be drawn from Fig. 4.
First, it is apparent that the method does discriminate between SNe Ia and the other types,
although $\log\mathcal{R}$ tends to be larger than 0 (so that $\mathcal{R} > 1$) because
SNe Ia are far more similar to other supernovae than they are to anything else. It should
be noted that the values of $\mathcal{R}$ tend to be quite large. This is due to our use of
a large number of measurements (~ 50) in a number of filters, which ensures that a
candidate is either very much identified as an SN Ia-like candidate or not. Second, as
expected, the discrimination between SNe Ia and non-Ia's increases with more information (4
filters vs. 2 filters) and/or with better prior knowledge (i.e., smaller errors on the
measured redshift). Third, the discrimination is somewhat worse at high redshifts (> ~1),
as we move into a domain of less precise data and less certain models; but it is still good
enough for $\mathcal{R}$ to be used as a first-pass SN Ia classifier. Additionally, at
least some plausible JDEMs "sculpt" their expected SNe Ia distribution so that it peaks at
z ~ 0.7 (Aldering et al. 2004).



One might ask how R would be affected if the templates used had incorrect colors for
the SN Ia hypothesis. In order to answer this question, we generated SN Ia candidates with
(A_V, R_V) = (0.2, 2.1) but used only the no-extinction templates when calculating R. Figure 5
shows the distributions of R for the cases where the extinction parameters in the data and
templates are matched and mis-matched. As expected, R decreases when the extinction in
the templates does not match that in the data; that is, the SNe Ia look more similar to
non-SNe Ia.

We further check the behavior of R by varying the errors on the fluxes of the simulated
SNe Ia to ensure that R changes in the right direction. This check, always a good idea for
a newly introduced statistic, is particularly important for this Bayes factor, which makes
use of improper priors that can lead to non-intuitive behavior (Berger and Pericchi 2001).
Figure 6 shows the distribution of log R for the "nominal" flux errors and for flux errors
artificially increased and decreased by a factor of 2. Increasing the flux errors shifts the
distribution to the left (i.e., the discrimination power decreases), while decreasing the flux
errors shifts it to the right (i.e., the discrimination power increases). This is the expected
behavior for a correctly computed R.
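The direction of this shift can be illustrated with a deliberately simplified toy model. The sketch below is not the Bayes factor of this paper: it assumes (purely for illustration) a Gaussian likelihood around a single SN Ia template and a flat "anything" likelihood over a hypothetical flux range, and it holds the residuals fixed in units of the flux error. The function name and all numbers are made up.

```python
import math

def log10_toy_bayes_factor(unit_residuals, sigma, flux_range):
    """Toy log10 R: Gaussian template likelihood over a flat 'anything' likelihood.

    ASSUMPTION: under 'anything', each flux is uniform on [0, flux_range].
    This is an illustrative stand-in, not the expression used in the paper.
    """
    log_r = 0.0
    for u in unit_residuals:
        # Gaussian log-density for a residual of u standard deviations
        log_ia = -0.5 * u * u - math.log(sigma * math.sqrt(2.0 * math.pi))
        # Flat log-density over the allowed flux range
        log_anything = -math.log(flux_range)
        log_r += log_ia - log_anything
    return log_r / math.log(10.0)

# Made-up residuals (in units of the flux error) and a made-up nominal error
unit_residuals = [0.3, -1.1, 0.8, -0.2, 1.4]
nominal_sigma = 0.05

log_r_nominal = log10_toy_bayes_factor(unit_residuals, nominal_sigma, 1.0)
log_r_doubled = log10_toy_bayes_factor(unit_residuals, 2.0 * nominal_sigma, 1.0)
log_r_halved = log10_toy_bayes_factor(unit_residuals, 0.5 * nominal_sigma, 1.0)

print(log_r_doubled, log_r_nominal, log_r_halved)
```

In this toy setting each measurement contributes log10(2) to the shift when the error is halved, so the distribution moves to the right for smaller errors and to the left for larger ones, which is the same qualitative behavior described above.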



3.3. Including Prior Knowledge on Non-Ia Supernova Types

So far, we have assumed no prior knowledge of SNe models that may contribute to the
set of observed non-SNe Ia. We showed that the Bayes factor described above is capable
of discriminating between SNe Ia and non-SNe Ia as well as random unimodal light curves
that mimic anomalous candidates. This discrimination is good, even though it requires
neither the knowledge of a complete set of objects that can mimic an SN Ia signal nor the
knowledge of the possible behavior of anomalous non-supernova objects that can contaminate
an SN Ia signal. As Fig. 4 shows, we could simply define a polynomial cut on R as a function
of redshift and have a very good discriminant.
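A cut of this kind can be sketched in a few lines. The example below assumes the cut is a polynomial threshold on log R evaluated at the candidate redshift; the coefficients and the candidate values are hypothetical and chosen only to show the mechanics, not taken from the figures.

```python
def polynomial_threshold(z, coeffs):
    """Threshold on log R at redshift z; coeffs are ordered lowest power first."""
    return sum(c * z**k for k, c in enumerate(coeffs))

def passes_cut(log_r, z, coeffs):
    """True if the candidate lies above the redshift-dependent threshold."""
    return log_r > polynomial_threshold(z, coeffs)

# HYPOTHETICAL coefficients: threshold = 40 - 10 z - 5 z^2
coeffs = [40.0, -10.0, -5.0]

# Made-up candidates as (redshift, log R) pairs
candidates = [(0.3, 55.0), (0.9, 20.0), (1.4, 30.0)]
selected = [passes_cut(log_r, z, coeffs) for z, log_r in candidates]
print(selected)
```

In practice the coefficients would be tuned on simulated samples so that the threshold follows the lower envelope of the SN Ia population at each redshift.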

However, one might be interested in considering a Bayes factor for which all known
non-SN Ia candidates would have R < 1. First, it is simply better to have an R that behaves
intuitively. Second, including more prior knowledge (i.e., the knowledge of what can
potentially mimic an SN Ia signal) can only sharpen the discrimination between
SNe Ia and non-SNe Ia. Third, and more importantly, it is a good idea to have a discriminant
such that R > 1 when the candidate is more likely to be an SN Ia than not, and R < 1
otherwise. This allows one to use a formalism similar to that described in Wald (1945, 1947)
in order to set thresholds on R with meaningful, pre-determined Type I and II error rates.

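Explicitly including the knowledge about the behavior of non-SNe Ia into Eqn. 1, we
define:

R' = \frac{P(\mathrm{Phot}|\mathrm{Ia})}
          {P(\mathrm{Phot}|\mathrm{II\text{-}P})\,P(\mathrm{II\text{-}P}|\mathrm{non\text{-}Ia})
           + P(\mathrm{Phot}|\mathrm{Ibc})\,P(\mathrm{Ibc}|\mathrm{non\text{-}Ia})
           + P(\mathrm{Phot}|\mathrm{IIn})\,P(\mathrm{IIn}|\mathrm{non\text{-}Ia})
           + P(\mathrm{Phot}|\mathrm{anything})\,P(\mathrm{anything}|\mathrm{non\text{-}Ia})},   (21)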

where the denominator now accounts for the probability that the observed photometry can
come from a Type II-P supernova, P(Phot|II-P); a Type Ibc supernova, P(Phot|Ibc); or from
a Type IIn supernova, P(Phot|IIn). P(Phot|anything) is equivalent to the denominator in
Eqn. 20. Note that the prior terms in Eqn. 21 are such that

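P(\mathrm{II\text{-}P}|\mathrm{non\text{-}Ia}) = P(\mathrm{Ibc}|\mathrm{non\text{-}Ia}) = P(\mathrm{IIn}|\mathrm{non\text{-}Ia}) = P(\mathrm{anything}|\mathrm{non\text{-}Ia}) = \frac{1}{4}.   (22)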

In other words, a non-Ia candidate is taken to be equally likely to be a Type Ibc
supernova, a Type II-P supernova, a Type IIn supernova, or some other object denoted as
"anything". It is of course trivial to introduce relative rates if they are known; however,
this is immaterial for our purpose, which is demonstrating the performance of the method.
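As a minimal sketch of how the ratio with explicit non-Ia terms could be evaluated once the individual likelihoods are available, the function below weights each non-Ia likelihood by its prior and divides the SN Ia likelihood by the sum. The function name, the dictionary layout, and the likelihood values are hypothetical; equal priors are used by default, and known relative rates can be passed in instead.

```python
def bayes_factor_prime(like_ia, like_non_ia, priors=None):
    """R' = P(Phot|Ia) / sum over T of P(Phot|T) * P(T|non-Ia).

    like_non_ia maps each non-Ia hypothesis (including 'anything') to its
    likelihood. If priors is None, all non-Ia hypotheses get equal weight.
    """
    types = list(like_non_ia)
    if priors is None:
        priors = {t: 1.0 / len(types) for t in types}
    if abs(sum(priors[t] for t in types) - 1.0) > 1e-9:
        raise ValueError("priors conditional on non-Ia must sum to 1")
    denominator = sum(like_non_ia[t] * priors[t] for t in types)
    return like_ia / denominator

# Made-up likelihoods for a single candidate
like_non_ia = {"II-P": 2e-4, "Ibc": 1e-3, "IIn": 4e-4, "anything": 4e-4}
r_prime = bayes_factor_prime(5e-3, like_non_ia)
print(r_prime)

# Hypothetical relative rates instead of equal priors
rates = {"II-P": 0.5, "Ibc": 0.2, "IIn": 0.1, "anything": 0.2}
r_prime_rates = bayes_factor_prime(5e-3, like_non_ia, priors=rates)
```

In a real calculation the likelihoods span many orders of magnitude, so working with logarithms throughout would be the safer choice; plain ratios are used here only to keep the sketch short.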

The non-Type Ia probabilities are calculated in exactly the same way as those for Type
Ia's, using the available models for the corresponding types. The distribution of log R' vs.
redshift is shown in Fig. 7 for the case of 2 filters (left column) and 4 filters (right column),
with 0.005 errors (bottom row) and 0.1 errors (top row) on the redshift. Randomly
generated, unimodal data all have large negative values of log R' that dwarf the scales on
these figures. The errors on the flux are those expected for the JDEM mission described above.
Comparing Fig. 7 to Fig. 4, it is clear that not only has the discrimination between SNe Ia
and everything else increased, but also the SNe Ia generally have log R' > 0 and other
candidates have log R' < 0. This is the desired behavior for R'.

A number of features of Fig. 7 are similar to those apparent in Fig. 4, such as the increase
in the discrimination power when more and/or better information becomes available.



3.4. Dangers of a Finite Set Assumption 

As we explained in Section 1, existing Bayesian-based methods of supernova classification
assume a finite set of possible objects that can mimic an SN Ia signal. To demonstrate
the danger of this limiting assumption, we use our unimodal fake light curves and calculate
Bayes factors defined as:

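R_{\mathrm{Ia}} = \frac{P(\mathrm{Phot}|\mathrm{Ia})\,P(\mathrm{Ia})}
                       {P(\mathrm{Phot}|\mathrm{Ibc})\,P(\mathrm{Ibc}|\mathrm{non\text{-}Ia})
                        + P(\mathrm{Phot}|\mathrm{II\text{-}P})\,P(\mathrm{II\text{-}P}|\mathrm{non\text{-}Ia})
                        + P(\mathrm{Phot}|\mathrm{IIn})\,P(\mathrm{IIn}|\mathrm{non\text{-}Ia})}   (23)

and

R_{\mathrm{II\text{-}P}} = \frac{P(\mathrm{Phot}|\mathrm{II\text{-}P})\,P(\mathrm{II\text{-}P})}
                                {P(\mathrm{Phot}|\mathrm{Ibc})\,P(\mathrm{Ibc}|\mathrm{non\text{-}Ia})
                                 + P(\mathrm{Phot}|\mathrm{Ia})\,P(\mathrm{Ia}|\mathrm{non\text{-}Ia})
                                 + P(\mathrm{Phot}|\mathrm{IIn})\,P(\mathrm{IIn}|\mathrm{non\text{-}Ia})}.   (24)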



Figure 8 shows that the fake objects can be mis-identified as supernovae of types other
than Ia (in particular, as Type II-P's), as well as SNe Ia: there are candidates with R_Ia > 1.
This further demonstrates the need for a general formalism that is capable of discriminating
a certain type of supernovae (most often Type Ia's) from anything else.
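The mechanism behind this mis-identification can be made concrete with a small numerical sketch. The likelihood values below are invented for illustration: the object fits every supernova model poorly, yet the finite-set ratio only compares the poor fits with each other. For simplicity the sketch sets the P(Ia) factor in the numerator to 1 and uses equal priors in the denominators; both choices are assumptions made only for this example.

```python
# Made-up likelihoods for one fake object: a poor fit to every supernova model
like = {"Ia": 1e-12, "Ibc": 2e-13, "II-P": 3e-13, "IIn": 1e-13}

# Made-up likelihood under a catch-all "anything" hypothesis
like_anything = 1e-6

# Finite set: only known supernova types appear in the denominator (equal priors)
prior_finite = 1.0 / 3.0
r_finite = like["Ia"] / (
    prior_finite * (like["Ibc"] + like["II-P"] + like["IIn"])
)

# Open set: the denominator also contains the "anything" term (equal priors)
prior_open = 1.0 / 4.0
r_open = like["Ia"] / (
    prior_open * (like["Ibc"] + like["II-P"] + like["IIn"] + like_anything)
)

print(r_finite, r_open)
```

With these numbers the finite-set ratio exceeds 1, so the object would be accepted as an SN Ia simply because the SN Ia model is the least bad of the available options, while the ratio that includes the catch-all term falls far below 1.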



4. Summary 

We have introduced a new photometric supernova classification scheme that uses a
Bayes factor based on color. The proposed method is fundamentally different from previous
supernova classification methods, including our own (Kuznetsova and Connolly 2007), because
it allows one to discriminate not only between supernovae of different types but also between
supernovae and "anomalous" objects. It has a number of definite advantages over many
existing techniques for selecting SNe Ia out of a pool of supernova candidates. The main one
is that it does not pre-suppose any prior knowledge about the objects that could potentially
mimic a Type Ia signal. It can thus be used as a very good first-pass Type Ia classifier. With
the current poor knowledge of the behavior of non-Type Ia supernovae, especially at high
redshifts, and the expected dramatic increase in the discoveries of new, as yet unknown classes
of transient astronomical objects, this feature of the method will be invaluable for future
supernova surveys. This is not, however, an excuse not to obtain as much information about
non-Type Ia supernovae as one possibly can, as evidenced by the advantage of computing R',
which includes information about the light curves of Type Ia, Ibc, IIn, and II-P supernovae
(more information means better performance).

Another principal advantage of the proposed method is that if the Bayes factor described
in Section 2.2 is used as a discriminant, the only supernova models that are required are those
for SNe Ia, which are the best studied and most complete of all the supernova types. One
may argue that the same is true for a χ²-based method. However, χ² methods suffer from
a number of problems described in Section 1, not the least of which is that for the case of data
with large uncertainties a given supernova candidate will appear to agree with many possible
supernova type hypotheses, with one necessarily giving the "best" χ² (however insignificant
the difference between this best χ² and the χ²'s from the other fits may be). The Bayes
factor, on the other hand, would be of order 1, reflecting ambiguity between the hypothesis
that the candidate is an SN Ia and that it is anything else.

Finally, as it is a Bayesian approach, it accounts for all systematic and statistical
uncertainties of the measurements in all their forms.

We demonstrated the method on a simulated dataset expected from a possible future
large-scale space-based JDEM. It must be noted that most existing data sets include
primarily SNe Ia; at the present time large datasets with exotic supernovae that would allow
us to perform a statistically meaningful study of the method's performance are not available
(although we are planning on exploring the application of this method to an existing dataset
of non-standard supernovae in a future publication). Nevertheless, a number of interesting
conclusions can be drawn from the simulation studies described above:

• The method provides 100% discrimination between SNe Ia and unimodal random
data. This is encouraging, since many transient objects (such as active galactic nuclei)
that are sometimes mistaken for supernovae tend to have a rather erratic behavior,
deviating far more from an SN Ia light curve than the simple random unimodal data.
More fundamentally, the discrimination also shows that R and R' behave as expected.

• The discrimination between Type Ia's and other supernova types is near 100% at
low redshifts when calculating log R. At higher redshifts, Fig. 4 shows that a
"straight line" cut on log R would render a reasonably high purity and efficiency for
SNe Ia. Alternatively, one could invent a more sophisticated cut (e.g., a polynomial).
The discrimination is practically 100% at all redshifts if log R' is used (i.e., when
information about the behavior of Type II-P, Type Ibc, and Type IIn supernovae is
included in calculating P(Phot|non-Ia)).

• The method performs better when more information about the data becomes available.
For example, the discrimination between SNe Ia and non-SNe Ia increases dramatically
when the number of filters goes from 2 to 4. This shows that, despite the use of
improper priors, the Bayes factor behaves properly.

• Increasing the precision on the redshift of the supernova candidates also improves the
discrimination between Type Ia's and non-SNe Ia.

• It is important that the data and the models used have a good match in terms of
expected colors. For example, if the extinction assumptions in the data are different
from those in the templates, real SNe Ia are more likely to be classified as anomalous
objects.

• Figures 4 and 7 indicate that there is sufficient separation between various supernova
candidates, so that this method could be used for classification in the strict sense of
giving a type (e.g., a Type Ia, a Type Ibc, etc.) to each candidate, along with some
associated probability.
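The purity and efficiency of a cut on log R, mentioned in the list above, can be computed as in the following sketch. It assumes the usual definitions (purity is the fraction of selected candidates that are true SNe Ia, efficiency is the fraction of true SNe Ia that are selected); the function name and the sample values are hypothetical.

```python
def purity_and_efficiency(log_r_values, is_ia_flags, threshold=0.0):
    """Purity and efficiency of the selection log R > threshold.

    purity     = selected true SNe Ia / all selected candidates
    efficiency = selected true SNe Ia / all true SNe Ia
    """
    selected = [log_r > threshold for log_r in log_r_values]
    n_selected = sum(selected)
    n_true_ia = sum(is_ia_flags)
    n_selected_ia = sum(1 for s, ia in zip(selected, is_ia_flags) if s and ia)
    purity = n_selected_ia / n_selected if n_selected else float("nan")
    efficiency = n_selected_ia / n_true_ia if n_true_ia else float("nan")
    return purity, efficiency

# Made-up simulated sample: log R values and the true type of each candidate
log_r = [12.0, 8.5, -3.0, 4.0, -20.0, 1.5]
is_ia = [True, True, True, False, False, False]

purity, efficiency = purity_and_efficiency(log_r, is_ia)
print(purity, efficiency)
```

Scanning the threshold (or the coefficients of a redshift-dependent cut) over a simulated sample gives the trade-off between the two quantities.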
The method does not make use of magnitudes. One might see this as an advantage if one
does not entirely trust the distribution of intrinsic magnitudes. Insertion of absolute
magnitudes is possible, but the formulation of R (R') becomes inelegant and requires knowledge
of an upper limit of intrinsic magnitudes of anomalous candidates that, by definition, have
not been observed. It is indeed fortunate that color alone is sufficient to classify supernovae,
allowing for a simple and elegant solution for R (R').

There is another reason not to include magnitudes into the formalism, at least at the
present time. The distributions of the intrinsic magnitudes for non-SNe Ia are at the moment
rather poorly known (Richardson et al. 2002). In fact, very little is known about
high-redshift non-SNe Ia (for example, Dahlen et al. (2008) remains the only measurement of the
non-SNe Ia rates to redshifts of ~ 1). There exists a very real need to measure the properties
of non-Ia supernovae with more precision, a task that is ideally suited for existing and
planned large-scale supernova surveys.

It is also important to note the computational challenges in calculating R (R'). The
number of filters used in the calculation depends on the precision needed for the integration
of {f_j}. It was found that for 4 filters about 150 integration points for each f_j were needed
for a precise calculation of the denominator, P(Phot|non-Ia). This was found by increasing the
number of integration points until the value of the denominator became stable.
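The convergence procedure described here (refine the grid until the integral stops changing) can be sketched as follows. The integrand is a toy one-dimensional Gaussian standing in for a likelihood; the trapezoid rule, the doubling strategy, the tolerance, and all names are assumptions made for illustration and are not the integration scheme used in the paper.

```python
import math

def trapezoid(func, lower, upper, n_points):
    """Composite trapezoid rule on n_points equally spaced points."""
    step = (upper - lower) / (n_points - 1)
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, n_points - 1):
        total += func(lower + i * step)
    return total * step

def integrate_until_stable(func, lower, upper, start=10, rel_tol=1e-4,
                           max_points=100000):
    """Double the number of points until the estimate changes by < rel_tol."""
    n_points = start
    previous = trapezoid(func, lower, upper, n_points)
    while n_points < max_points:
        n_points *= 2
        current = trapezoid(func, lower, upper, n_points)
        if abs(current - previous) <= rel_tol * abs(current):
            return current, n_points
        previous = current
    raise RuntimeError("integral did not stabilize")

def toy_likelihood(flux):
    """Toy integrand: narrow Gaussian in flux (center 1.0, width 0.05)."""
    return math.exp(-0.5 * ((flux - 1.0) / 0.05) ** 2)

value, n_used = integrate_until_stable(toy_likelihood, 0.0, 2.0)
print(value, n_used)
```

The narrower the integrand relative to the integration range, the more points are needed before the estimate stabilizes, which is why the required grid size has to be checked empirically.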

In order to complete the computations in a reasonable amount of time, it was necessary
to perform the calculations of the Bayes factor for many candidates in parallel. The
computational feasibility also depended on the approximation discussed in Section 2.2,
which reduces the number of integrations from N x M (where N is the number of flux
measurements and M is the number of filters) to M. This effectively required that we give
up information about the supernova colors measured in the various filters and allow the
colors to vary measurement-to-measurement. One might be concerned that this approximation
would in fact allow for a greater diversity in what is considered an SN Ia; that is, in
general, objects would have a greater chance of faking an SN Ia. However, in our studies we
found that, at least with the level of precision of the models and simulated data used, the
calculated Bayes factor provided the desired discrimination between SNe Ia and other objects,
leading us to believe that this is not a significant effect.

We also point out that our proposed technique is general enough to be used for objects
other than supernovae. For example, Richards et al. (2004) propose a Bayesian classifier to
differentiate quasars and stars, defined as:

P(\mathrm{star}|x) = \frac{P(x|\mathrm{star})\,P(\mathrm{star})}
                          {P(x|\mathrm{star})\,P(\mathrm{star}) + P(x|\mathrm{quasar})\,P(\mathrm{quasar})},   (25)

where x is a candidate's position in a 4-dimensional color space. Although the likelihoods
P(x|star) and P(x|quasar) are obtained using a training sample derived from real data, new
data inevitably bring new objects, which could be accounted for by inserting an "anomaly"
term similar to P(Phot|anything)P(anything) in Eqn. 21. This term would sweep up those
objects that do not conform to the existing models for quasars or stars in the same way that
P(Phot|anything)P(anything) accounts for anomalous supernova candidates; its exact form
would depend on the nature of the statistical fluctuations in the data.



4.1. R Used in the Context of an SN Ia Trigger

Note that while our method relies only on photometric information about supernovae,
some proposed future space-based dark energy missions do plan on obtaining the spectrum
of every candidate. One possible scenario would be to obtain the spectrum of a candidate
provided that a) it is highly likely to be an SN Ia, and b) it is at its peak brightness. In this
case it is necessary to have a reliable means of telling whether or not a given candidate
is most likely an SN Ia based on its pre-maximum photometric measurements alone.
In other words, it is important to have a trigger mechanism that would photometrically select
candidates for possible spectroscopic follow-up. Our proposed Bayes factor can be simply
modified to allow for such a usage. Assuming that one wants to trigger on supernovae before
maximum brightness, the Bayes factor becomes:

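R'' = \frac{P(\mathrm{Phot}|\mathrm{Ia\ pre\text{-}max})}
           {P(\mathrm{Phot}|\mathrm{Ia\ post\text{-}max})
            + P(\mathrm{Phot}|\mathrm{II\text{-}P})\,P(\mathrm{II\text{-}P}|\mathrm{non\text{-}Ia})
            + P(\mathrm{Phot}|\mathrm{Ibc})\,P(\mathrm{Ibc}|\mathrm{non\text{-}Ia})
            + P(\mathrm{Phot}|\mathrm{anything})\,P(\mathrm{anything}|\mathrm{non\text{-}Ia})},   (26)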

where P(Phot|Ia pre-max) would only include supernova models with points before maximum
light, while P(Phot|Ia post-max) would only include those with points after maximum
light. After calculating this Bayes factor, a cut would be made at, say, R'' > 1 to choose
those candidates that are likely to be SNe Ia and that have not yet reached maximum.



Better still, one could use a sequential analysis technique (Wald 1945, 1947) to minimize
the data required to make this decision while simultaneously controlling identification errors.
This is done by setting thresholds on R'' based on pre-selected Type I and Type II error
rates. The demonstration of the performance of a sequential analysis-based approach will
be the subject of a future publication.
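As a rough sketch of what such a sequential decision rule could look like, the example below uses Wald's standard approximate thresholds for a sequential probability ratio test: with Type I error rate alpha and Type II error rate beta, the ratio is compared with (1 - beta)/alpha and beta/(1 - alpha). Treating the log of the trigger ratio as a sum of per-measurement increments is an assumption made for illustration, as are the function names, the error rates, and the increment values.

```python
import math

def wald_log_thresholds(alpha, beta):
    """Wald's approximate SPRT thresholds on the log of the ratio."""
    upper = math.log((1.0 - beta) / alpha)   # accept "SN Ia before maximum"
    lower = math.log(beta / (1.0 - alpha))   # reject the candidate
    return lower, upper

def sequential_decision(log_ratio_increments, alpha=0.01, beta=0.05):
    """Accumulate log-ratio increments until one of the thresholds is crossed."""
    lower, upper = wald_log_thresholds(alpha, beta)
    total = 0.0
    for n_used, increment in enumerate(log_ratio_increments, start=1):
        total += increment
        if total >= upper:
            return "trigger", n_used
        if total <= lower:
            return "reject", n_used
    return "undecided", len(log_ratio_increments)

# Made-up natural-log increments contributed by successive measurements
decision, n_used = sequential_decision([1.2, 0.8, 1.5, 1.4, 2.0])
print(decision, n_used)

rejected, n_rejected = sequential_decision([-1.0, -1.5, -1.0])
print(rejected, n_rejected)
```

The appeal of this scheme is that the decision is made as soon as the accumulated evidence is sufficient, so candidates with a clear signal need fewer measurements than ambiguous ones.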
Fig. 4. Distributions of log R vs. redshift for Type Ia's (upward turned triangles), Type
Ibc's (downward turned triangles), Type II-P's (filled circles) and Type IIn's (open circles),
with a 0.1 error on the candidate redshifts (top row) and a 0.005 error on the candidate
redshifts (bottom row), for 2 filters (left plots) and 4 filters (right plots). The unimodal
random data are not over-plotted on these figures because they all have large negative values
of log R that dwarf the y-axis scales.

Fig. 5. The distributions of R for the case of the extinction mis-match between the
data and the templates (solid line), and the case of the matching extinction assumptions
for the data and the templates (dashed line). The mis-matching case clearly results in a
worse discrimination, making SNe Ia look more like "anomalous" objects; this is because the
extinction in the data is not accounted for in the set of templates used to define an SN Ia.

Fig. 6. Distributions of log R for simulated Type Ia candidates for nominal flux errors
(solid line), flux errors increased by a factor of 2 (dashed line), and flux errors decreased by
a factor of 2 (dot-dashed line, filled histogram). The histograms have been normalized to
the same area to aid in the comparison of their shapes.

Fig. 7. Distributions of log R' vs. redshift for Type Ia's (upward turned triangles), Type
Ibc's (downward turned triangles), Type II-P's (filled circles) and Type IIn's (open circles),
with a 0.1 error on the candidate redshifts (top row) and a 0.005 error on the candidate
redshifts (bottom row), for 2 filters (left plots) and 4 filters (right plots).

Fig. 8. The distributions of log R_Ia (top) and log R_II-P (bottom) for a set of unimodal
random light curves.



